Error Entropy Minimization for LSTM Training

Authors

  • Luís A. Alexandre
  • Joaquim Marques de Sá
Abstract

In this paper we present a new training algorithm for the Long Short-Term Memory (LSTM) recurrent neural network. This algorithm uses entropy instead of the usual mean squared error as the cost function for the weight update. More precisely, we use the Error Entropy Minimization (EEM) approach, where the entropy of the error is minimized after each symbol is presented to the network. Our experiments show that this approach enables the convergence of the LSTM more frequently than the traditional learning algorithm does. This in turn relaxes the burden of parameter tuning, since learning is achieved for a wider range of parameter values. The use of EEM also reduces, in some cases, the number of epochs needed for convergence.
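As a concrete illustration of the idea, the sketch below replaces the usual MSE loss of a small LSTM with a Parzen-window estimate of Renyi's quadratic error entropy, the estimator used in the EEM literature cited below. This is a minimal PyTorch sketch, not the authors' implementation: the network size, kernel width sigma, learning rate, and the toy task are all illustrative assumptions.

```python
import torch
import torch.nn as nn

def renyi_quadratic_error_entropy(err, sigma=1.0):
    """Parzen-window estimate of Renyi's quadratic entropy of the error:
    H2(e) = -log( (1/N^2) * sum_ij G(e_i - e_j; 2*sigma^2) ).
    The Gaussian normalization constant is dropped: it only shifts the
    log by a constant and does not affect the gradient."""
    diffs = err.unsqueeze(0) - err.unsqueeze(1)           # (N, N) pairwise differences
    kernel = torch.exp(-diffs.pow(2) / (4 * sigma ** 2))  # Gaussian kernel of variance 2*sigma^2
    return -torch.log(kernel.mean())                      # -log of the "information potential"

# Toy setup (illustrative): predict the sum of each input sequence.
lstm = nn.LSTM(input_size=1, hidden_size=8, batch_first=True)
readout = nn.Linear(8, 1)
opt = torch.optim.SGD(list(lstm.parameters()) + list(readout.parameters()), lr=0.05)

x = torch.randn(16, 10, 1)   # 16 sequences of length 10
y = x.sum(dim=1)             # target: per-sequence sum

for epoch in range(200):
    out, _ = lstm(x)
    pred = readout(out[:, -1, :])                               # read out the last time step
    loss = renyi_quadratic_error_entropy((y - pred).flatten())  # EEM cost in place of MSE
    opt.zero_grad()
    loss.backward()
    opt.step()
```

One caveat worth noting: entropy is invariant to a shift of the error, so EEM implementations in the literature typically re-center the error afterwards, for example by adjusting the output bias so the mean error is zero; that step is omitted from the sketch.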


Related articles

An error-entropy minimization algorithm for supervised training of nonlinear adaptive systems

This paper investigates error-entropy minimization in adaptive systems training. We prove the equivalence between minimization of the error's Renyi entropy of order α and minimization of a Csiszar distance measure between the densities of desired and system outputs. A nonparametric estimator for Renyi's entropy is presented, and it is shown that the global minimum of this estimator is the same as the...
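The nonparametric estimator this abstract refers to is what this line of work calls the information potential form. Reconstructed from the standard EEM formulation (not from the truncated text itself), the estimator of Renyi's quadratic entropy with Gaussian Parzen kernels of width σ is:

```latex
\hat{H}_2(e) = -\log\!\Big( \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N}
  G_{\sigma\sqrt{2}}(e_i - e_j) \Big),
\qquad
G_s(x) = \frac{1}{\sqrt{2\pi}\, s}\, e^{-x^2 / (2s^2)}
```

The kernel width σ√2 arises because the convolution of two Gaussian kernels of width σ is a Gaussian of width σ√2; minimizing \hat{H}_2 is then equivalent to maximizing the double sum (the information potential).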


Entropy minimization for supervised digital communications channel equalization

This paper investigates the application of error-entropy minimization algorithms to digital communications channel equalization. The pdf of the error between the training sequence and the output of the equalizer is estimated using the Parzen windowing method with a Gaussian kernel, and then Renyi's quadratic entropy is minimized using a gradient descent algorithm. By estimating Renyi's ...
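For a linear equalizer the gradient-descent step described here can be written in closed form. The numpy sketch below differentiates the Parzen entropy estimate analytically; the linear y = Xw structure, kernel width, and step size are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def eem_step(w, X, d, sigma=0.5, lr=0.1):
    """One gradient-descent step on Renyi's quadratic error entropy
    for a linear equalizer y = X @ w. Here d is the training sequence
    and X holds the equalizer inputs; hyperparameters are illustrative."""
    e = d - X @ w                              # (N,) error per sample
    de = e[:, None] - e[None, :]               # (N, N) pairwise error differences
    K = np.exp(-de ** 2 / (4 * sigma ** 2))    # Gaussian kernel of variance 2*sigma^2
    V = K.mean()                               # information potential
    # d(e_i - e_j)/dw = x_j - x_i, so by the chain rule:
    dX = X[None, :, :] - X[:, None, :]         # (N, N, dim): x_j - x_i
    dV = ((K * (-de / (2 * sigma ** 2)))[:, :, None] * dX).mean(axis=(0, 1))
    grad = -dV / V                             # gradient of H2 = -log(V)
    return w - lr * grad

# Usage sketch: recover 3 equalizer taps from noisy observations
# against a known training sequence d.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
d = X @ np.array([1.0, -0.5, 0.25]) + 0.01 * rng.standard_normal(200)
w = np.zeros(3)
for _ in range(300):
    w = eem_step(w, X, d)
```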


Comparison of Entropy and Mean Square Error Criteria in Adaptive System Training Using Higher Order Statistics

The error-entropy-minimization approach in adaptive system training is addressed in this paper. The effect of Parzen windowing on the location of the global minimum of entropy has been investigated. An analytical proof that shows the global minimum of the entropy is a local minimum, possibly the global minimum, of the nonparametrically estimated entropy using Parzen windowing with Gaussian kern...


Diversity encouraged learning of unsupervised LSTM ensemble for neural activity video prediction

Being able to predict the neural signal in the near future from the current and previous observations has the potential to enable real-time responsive brain stimulation to suppress seizures. We have investigated how to use an auto-encoder model consisting of LSTM cells for such prediction. Recognizing that there exist multiple activity pattern clusters, we have further explored to train an ense...


Approaches for Neural-Network Language Model Adaptation

Language Models (LMs) for Automatic Speech Recognition (ASR) are typically trained on large text corpora from news articles, books and web documents. These types of corpora, however, are unlikely to match the test distribution of ASR systems, which expect spoken utterances. Therefore, the LM is typically adapted to a smaller held-out in-domain dataset that is drawn from the test distribution. W...



Journal title:

Volume   Issue

Pages  -

Publication date: 2006